{
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "<i>Copyright (c) Recommenders contributors.</i>\n",
                "\n",
                "<i>Licensed under the MIT License.</i>"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# Running ALS on MovieLens (PySpark)\n",
                "\n",
                "Matrix factorization by [ALS](https://spark.apache.org/docs/latest/api/python/_modules/pyspark/ml/recommendation.html#ALS) (Alternating Least Squares) is a well known collaborative filtering algorithm.\n",
                "\n",
                "This notebook provides an example of how to utilize and evaluate ALS PySpark ML (DataFrame-based API) implementation, meant for large-scale distributed datasets. We use a smaller dataset in this example to run ALS efficiently on multiple cores of a [Data Science Virtual Machine](https://azure.microsoft.com/en-gb/services/virtual-machines/data-science-virtual-machines/)."
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "**Note**: This notebook requires a PySpark environment to run properly. Please follow the steps in [SETUP.md](../../SETUP.md) to install the PySpark environment."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "System version: 3.8.0 (default, Nov  6 2019, 21:49:08) \n",
                        "[GCC 7.3.0]\n",
                        "Spark version: 3.2.0\n"
                    ]
                }
            ],
            "source": [
                "import warnings\n",
                "warnings.simplefilter(action='ignore', category=FutureWarning)\n",
                "\n",
                "import sys\n",
                "import pyspark\n",
                "from pyspark.ml.recommendation import ALS\n",
                "import pyspark.sql.functions as F\n",
                "from pyspark.sql import SparkSession\n",
                "from pyspark.sql.types import StructType, StructField\n",
                "from pyspark.sql.types import StringType, FloatType, IntegerType, LongType\n",
                "\n",
                "from recommenders.utils.timer import Timer\n",
                "from recommenders.datasets import movielens\n",
                "from recommenders.utils.notebook_utils import is_jupyter\n",
                "from recommenders.datasets.spark_splitters import spark_random_split\n",
                "from recommenders.evaluation.spark_evaluation import SparkRatingEvaluation, SparkRankingEvaluation\n",
                "from recommenders.utils.spark_utils import start_or_get_spark\n",
                "from recommenders.utils.notebook_utils import store_metadata\n",
                "\n",
                "print(f\"System version: {sys.version}\")\n",
                "print(\"Spark version: {}\".format(pyspark.__version__))\n"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Set the default parameters."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 2,
            "metadata": {
                "tags": [
                    "parameters"
                ]
            },
            "outputs": [],
            "source": [
                "# top k items to recommend\n",
                "TOP_K = 10\n",
                "\n",
                "# Select MovieLens data size: 100k, 1m, 10m, or 20m\n",
                "MOVIELENS_DATA_SIZE = '100k'\n",
                "\n",
                "# Column names for the dataset\n",
                "COL_USER = \"UserId\"\n",
                "COL_ITEM = \"MovieId\"\n",
                "COL_RATING = \"Rating\"\n",
                "COL_TIMESTAMP = \"Timestamp\""
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### 0. Set up Spark context\n",
                "\n",
                "The following settings work well for debugging locally on VM - change when running on a cluster. We set up a giant single executor with many threads and specify memory cap. "
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [],
            "source": [
                "# the following settings work well for debugging locally on VM - change when running on a cluster\n",
                "# set up a giant single executor with many threads and specify memory cap\n",
                "spark = start_or_get_spark(\"ALS PySpark\", memory=\"16g\")\n",
                "spark.conf.set(\"spark.sql.analyzer.failAmbiguousSelfJoin\", \"false\")"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### 1. Download the MovieLens dataset"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 4,
            "metadata": {},
            "outputs": [
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4.81k/4.81k [00:05<00:00, 882KB/s]\n",
                        "                                                                                \r"
                    ]
                },
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "+------+-------+------+---------+\n",
                        "|UserId|MovieId|Rating|Timestamp|\n",
                        "+------+-------+------+---------+\n",
                        "|   196|    242|   3.0|881250949|\n",
                        "|   186|    302|   3.0|891717742|\n",
                        "|    22|    377|   1.0|878887116|\n",
                        "|   244|     51|   2.0|880606923|\n",
                        "|   166|    346|   1.0|886397596|\n",
                        "|   298|    474|   4.0|884182806|\n",
                        "|   115|    265|   2.0|881171488|\n",
                        "|   253|    465|   5.0|891628467|\n",
                        "|   305|    451|   3.0|886324817|\n",
                        "|     6|     86|   3.0|883603013|\n",
                        "|    62|    257|   2.0|879372434|\n",
                        "|   286|   1014|   5.0|879781125|\n",
                        "|   200|    222|   5.0|876042340|\n",
                        "|   210|     40|   3.0|891035994|\n",
                        "|   224|     29|   3.0|888104457|\n",
                        "|   303|    785|   3.0|879485318|\n",
                        "|   122|    387|   5.0|879270459|\n",
                        "|   194|    274|   2.0|879539794|\n",
                        "|   291|   1042|   4.0|874834944|\n",
                        "|   234|   1184|   2.0|892079237|\n",
                        "+------+-------+------+---------+\n",
                        "only showing top 20 rows\n",
                        "\n"
                    ]
                }
            ],
            "source": [
                "# Note: The DataFrame-based API for ALS currently only supports integers for user and item ids.\n",
                "schema = StructType(\n",
                "    (\n",
                "        StructField(COL_USER, IntegerType()),\n",
                "        StructField(COL_ITEM, IntegerType()),\n",
                "        StructField(COL_RATING, FloatType()),\n",
                "        StructField(COL_TIMESTAMP, LongType()),\n",
                "    )\n",
                ")\n",
                "\n",
                "data = movielens.load_spark_df(spark, size=MOVIELENS_DATA_SIZE, schema=schema)\n",
                "data.show()"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### 2. Split the data using the Spark random splitter provided in utilities"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 5,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "N train 75018\n",
                        "N test 24982\n"
                    ]
                }
            ],
            "source": [
                "train, test = spark_random_split(data, ratio=0.75, seed=123)\n",
                "print (\"N train\", train.cache().count())\n",
                "print (\"N test\", test.cache().count())"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### 3. Train the ALS model on the training data, and get the top-k recommendations for our testing data\n",
                "\n",
                "To predict movie ratings, we use the rating data in the training set as users' explicit feedback. The hyperparameters used in building the model are referenced from [here](http://mymedialite.net/examples/datasets.html). We do not constrain the latent factors (`nonnegative = False`) in order to allow for both positive and negative preferences towards movies.\n",
                "Timing will vary depending on the machine being used to train."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 6,
            "metadata": {},
            "outputs": [],
            "source": [
                "header = {\n",
                "    \"userCol\": COL_USER,\n",
                "    \"itemCol\": COL_ITEM,\n",
                "    \"ratingCol\": COL_RATING,\n",
                "}\n",
                "\n",
                "\n",
                "als = ALS(\n",
                "    rank=10,\n",
                "    maxIter=15,\n",
                "    implicitPrefs=False,\n",
                "    regParam=0.05,\n",
                "    coldStartStrategy='drop',\n",
                "    nonnegative=False,\n",
                "    seed=42,\n",
                "    **header\n",
                ")"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "Took 7.5410127229988575 seconds for training.\n"
                    ]
                }
            ],
            "source": [
                "with Timer() as train_time:\n",
                "    model = als.fit(train)\n",
                "\n",
                "print(\"Took {} seconds for training.\".format(train_time.interval))"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "In the movie recommendation use case, recommending movies that have been rated by the users do not make sense. Therefore, the rated movies are removed from the recommended items.\n",
                "\n",
                "In order to achieve this, we recommend all movies to all users, and then remove the user-movie pairs that exist in the training dataset."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 8,
            "metadata": {},
            "outputs": [
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "[Stage 126:====================================================>(198 + 2) / 200]\r"
                    ]
                },
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "Took 25.246142672998758 seconds for prediction.\n"
                    ]
                },
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "\r\n",
                        "                                                                                \r"
                    ]
                }
            ],
            "source": [
                "with Timer() as test_time:\n",
                "\n",
                "    # Get the cross join of all user-item pairs and score them.\n",
                "    users = train.select(COL_USER).distinct()\n",
                "    items = train.select(COL_ITEM).distinct()\n",
                "    user_item = users.crossJoin(items)\n",
                "    dfs_pred = model.transform(user_item)\n",
                "\n",
                "    # Remove seen items.\n",
                "    dfs_pred_exclude_train = dfs_pred.alias(\"pred\").join(\n",
                "        train.alias(\"train\"),\n",
                "        (dfs_pred[COL_USER] == train[COL_USER]) & (dfs_pred[COL_ITEM] == train[COL_ITEM]),\n",
                "        how='outer'\n",
                "    )\n",
                "\n",
                "    top_all = dfs_pred_exclude_train.filter(dfs_pred_exclude_train[f\"train.{COL_RATING}\"].isNull()) \\\n",
                "        .select('pred.' + COL_USER, 'pred.' + COL_ITEM, 'pred.' + \"prediction\")\n",
                "\n",
                "    # In Spark, transformations are lazy evaluation\n",
                "    # Use an action to force execute and measure the test time \n",
                "    top_all.cache().count()\n",
                "\n",
                "print(\"Took {} seconds for prediction.\".format(test_time.interval))"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 9,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "+------+-------+----------+\n",
                        "|UserId|MovieId|prediction|\n",
                        "+------+-------+----------+\n",
                        "|     1|    587| 4.1602826|\n",
                        "|     1|    869| 2.7732863|\n",
                        "|     1|   1208|  2.033383|\n",
                        "|     1|   1348| 1.0019257|\n",
                        "|     1|   1357| 0.9430026|\n",
                        "|     1|   1677| 2.8777318|\n",
                        "|     2|     80|  2.351385|\n",
                        "|     2|    472| 2.5865319|\n",
                        "|     2|    582| 3.9548612|\n",
                        "|     2|    838| 0.9482963|\n",
                        "|     2|    975| 3.1133535|\n",
                        "|     2|   1260| 1.9871743|\n",
                        "|     2|   1325| 1.2368056|\n",
                        "|     2|   1381| 3.5477588|\n",
                        "|     2|   1530|   2.08829|\n",
                        "|     3|     22| 3.1524537|\n",
                        "|     3|     57| 3.6980162|\n",
                        "|     3|     89| 3.9733813|\n",
                        "|     3|    367| 3.6629045|\n",
                        "|     3|   1091| 0.9144474|\n",
                        "+------+-------+----------+\n",
                        "only showing top 20 rows\n",
                        "\n"
                    ]
                }
            ],
            "source": [
                "top_all.show()"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### 4. Evaluate how well ALS performs"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 10,
            "metadata": {},
            "outputs": [
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "                                                                                \r"
                    ]
                }
            ],
            "source": [
                "rank_eval = SparkRankingEvaluation(test, top_all, k = TOP_K, col_user=COL_USER, col_item=COL_ITEM, \n",
                "                                    col_rating=COL_RATING, col_prediction=\"prediction\", \n",
                "                                    relevancy_method=\"top_k\")"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 11,
            "metadata": {},
            "outputs": [
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "[Stage 463:>                                                        (0 + 2) / 2]\r"
                    ]
                },
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "Model:\tALS\n",
                        "Top K:\t10\n",
                        "MAP:\t0.006527\n",
                        "NDCG:\t0.051718\n",
                        "Precision@K:\t0.051274\n",
                        "Recall@K:\t0.018840\n"
                    ]
                },
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "\r\n",
                        "                                                                                \r"
                    ]
                }
            ],
            "source": [
                "print(\"Model:\\tALS\",\n",
                "      \"Top K:\\t%d\" % rank_eval.k,\n",
                "      \"MAP:\\t%f\" % rank_eval.map_at_k(),\n",
                "      \"NDCG:\\t%f\" % rank_eval.ndcg_at_k(),\n",
                "      \"Precision@K:\\t%f\" % rank_eval.precision_at_k(),\n",
                "      \"Recall@K:\\t%f\" % rank_eval.recall_at_k(), sep='\\n')"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### 5. Evaluate rating prediction"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 12,
            "metadata": {},
            "outputs": [
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "[Stage 500:=============================================>       (171 + 3) / 200]\r"
                    ]
                },
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "+------+-------+------+---------+----------+\n",
                        "|UserId|MovieId|Rating|Timestamp|prediction|\n",
                        "+------+-------+------+---------+----------+\n",
                        "|   580|    148|   4.0|884125773| 3.4059548|\n",
                        "|   406|    148|   3.0|879540276| 2.7134619|\n",
                        "|   916|    148|   2.0|880843892| 2.2241986|\n",
                        "|   663|    148|   4.0|889492989|  2.714362|\n",
                        "|   330|    148|   4.0|876544781|   4.52321|\n",
                        "|   935|    148|   4.0|884472892| 4.3838587|\n",
                        "|   308|    148|   3.0|887740788| 2.6169493|\n",
                        "|    20|    148|   5.0|879668713| 4.3721194|\n",
                        "|   923|    148|   4.0|880387474| 3.9818575|\n",
                        "|   455|    148|   3.0|879110346| 3.0764186|\n",
                        "|    15|    148|   3.0|879456049| 2.9913845|\n",
                        "|   374|    148|   4.0|880392992| 3.2223384|\n",
                        "|   880|    148|   2.0|880167030| 2.8111982|\n",
                        "|   677|    148|   4.0|889399265| 3.8451843|\n",
                        "|    49|    148|   1.0|888068195| 1.3751594|\n",
                        "|   244|    148|   2.0|880605071| 2.6781514|\n",
                        "|    84|    148|   4.0|883452274| 3.6721768|\n",
                        "|   627|    148|   3.0|879530463| 2.6362069|\n",
                        "|   434|    148|   3.0|886724797| 3.0973828|\n",
                        "|   793|    148|   4.0|875104498| 2.2886577|\n",
                        "+------+-------+------+---------+----------+\n",
                        "only showing top 20 rows\n",
                        "\n"
                    ]
                },
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "\r\n",
                        "[Stage 500:=================================================>   (186 + 3) / 200]\r\n",
                        "\r\n",
                        "                                                                                \r"
                    ]
                }
            ],
            "source": [
                "# Generate predicted ratings.\n",
                "prediction = model.transform(test)\n",
                "prediction.cache().show()\n"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 13,
            "metadata": {},
            "outputs": [
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "[Stage 775:==============================================>      (174 + 2) / 200]\r"
                    ]
                },
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "Model:\tALS rating prediction\n",
                        "RMSE:\t0.967434\n",
                        "MAE:\t0.753340\n",
                        "Explained variance:\t0.265916\n",
                        "R squared:\t0.259532\n"
                    ]
                },
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "\r\n",
                        "                                                                                \r"
                    ]
                }
            ],
            "source": [
                "rating_eval = SparkRatingEvaluation(test, prediction, col_user=COL_USER, col_item=COL_ITEM, \n",
                "                                    col_rating=COL_RATING, col_prediction=\"prediction\")\n",
                "\n",
                "print(\"Model:\\tALS rating prediction\",\n",
                "      \"RMSE:\\t%f\" % rating_eval.rmse(),\n",
                "      \"MAE:\\t%f\" % rating_eval.mae(),\n",
                "      \"Explained variance:\\t%f\" % rating_eval.exp_var(),\n",
                "      \"R squared:\\t%f\" % rating_eval.rsquared(), sep='\\n')"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 14,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "application/scrapbook.scrap.json+json": {
                            "data": 0.006527288768086336,
                            "encoder": "json",
                            "name": "map",
                            "version": 1
                        }
                    },
                    "metadata": {
                        "scrapbook": {
                            "data": true,
                            "display": false,
                            "name": "map"
                        }
                    },
                    "output_type": "display_data"
                },
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "                                                                                \r"
                    ]
                },
                {
                    "data": {
                        "application/scrapbook.scrap.json+json": {
                            "data": 0.051717802220247217,
                            "encoder": "json",
                            "name": "ndcg",
                            "version": 1
                        }
                    },
                    "metadata": {
                        "scrapbook": {
                            "data": true,
                            "display": false,
                            "name": "ndcg"
                        }
                    },
                    "output_type": "display_data"
                },
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "                                                                                \r"
                    ]
                },
                {
                    "data": {
                        "application/scrapbook.scrap.json+json": {
                            "data": 0.05127388535031851,
                            "encoder": "json",
                            "name": "precision",
                            "version": 1
                        }
                    },
                    "metadata": {
                        "scrapbook": {
                            "data": true,
                            "display": false,
                            "name": "precision"
                        }
                    },
                    "output_type": "display_data"
                },
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "\r\n",
                        "[Stage 904:>                                                        (0 + 2) / 2]\r\n",
                        "\r\n",
                        "                                                                                \r"
                    ]
                },
                {
                    "data": {
                        "application/scrapbook.scrap.json+json": {
                            "data": 0.018840283525491316,
                            "encoder": "json",
                            "name": "recall",
                            "version": 1
                        }
                    },
                    "metadata": {
                        "scrapbook": {
                            "data": true,
                            "display": false,
                            "name": "recall"
                        }
                    },
                    "output_type": "display_data"
                },
                {
                    "data": {
                        "application/scrapbook.scrap.json+json": {
                            "data": 0.9674342234414528,
                            "encoder": "json",
                            "name": "rmse",
                            "version": 1
                        }
                    },
                    "metadata": {
                        "scrapbook": {
                            "data": true,
                            "display": false,
                            "name": "rmse"
                        }
                    },
                    "output_type": "display_data"
                },
                {
                    "data": {
                        "application/scrapbook.scrap.json+json": {
                            "data": 0.7533395161385739,
                            "encoder": "json",
                            "name": "mae",
                            "version": 1
                        }
                    },
                    "metadata": {
                        "scrapbook": {
                            "data": true,
                            "display": false,
                            "name": "mae"
                        }
                    },
                    "output_type": "display_data"
                },
                {
                    "name": "stderr",
                    "output_type": "stream",
                    "text": [
                        "                                                                                \r"
                    ]
                },
                {
                    "data": {
                        "application/scrapbook.scrap.json+json": {
                            "data": 0.2659161968930053,
                            "encoder": "json",
                            "name": "exp_var",
                            "version": 1
                        }
                    },
                    "metadata": {
                        "scrapbook": {
                            "data": true,
                            "display": false,
                            "name": "exp_var"
                        }
                    },
                    "output_type": "display_data"
                },
                {
                    "data": {
                        "application/scrapbook.scrap.json+json": {
                            "data": 0.2595322728476255,
                            "encoder": "json",
                            "name": "rsquared",
                            "version": 1
                        }
                    },
                    "metadata": {
                        "scrapbook": {
                            "data": true,
                            "display": false,
                            "name": "rsquared"
                        }
                    },
                    "output_type": "display_data"
                },
                {
                    "data": {
                        "application/scrapbook.scrap.json+json": {
                            "data": 7.5410127229988575,
                            "encoder": "json",
                            "name": "train_time",
                            "version": 1
                        }
                    },
                    "metadata": {
                        "scrapbook": {
                            "data": true,
                            "display": false,
                            "name": "train_time"
                        }
                    },
                    "output_type": "display_data"
                },
                {
                    "data": {
                        "application/scrapbook.scrap.json+json": {
                            "data": 25.246142672998758,
                            "encoder": "json",
                            "name": "test_time",
                            "version": 1
                        }
                    },
                    "metadata": {
                        "scrapbook": {
                            "data": true,
                            "display": false,
                            "name": "test_time"
                        }
                    },
                    "output_type": "display_data"
                }
            ],
            "source": [
                "# Record results for tests - ignore this cell\n",
                "if is_jupyter():\n",
                "    store_metadata(\"map\", rank_eval.map_at_k())\n",
                "    store_metadata(\"ndcg\", rank_eval.ndcg_at_k())\n",
                "    store_metadata(\"precision\", rank_eval.precision_at_k())\n",
                "    store_metadata(\"recall\", rank_eval.recall_at_k())\n",
                "    store_metadata(\"rmse\", rating_eval.rmse())\n",
                "    store_metadata(\"mae\", rating_eval.mae())\n",
                "    store_metadata(\"exp_var\", rating_eval.exp_var())\n",
                "    store_metadata(\"rsquared\", rating_eval.rsquared())\n",
                "    store_metadata(\"train_time\", train_time.interval)\n",
                "    store_metadata(\"test_time\", test_time.interval)"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 15,
            "metadata": {},
            "outputs": [],
            "source": [
                "# cleanup spark instance\n",
                "spark.stop()"
            ]
        }
    ],
    "metadata": {
        "kernelspec": {
            "display_name": "Python (reco)",
            "language": "python",
            "name": "reco"
        },
        "language_info": {
            "codemirror_mode": {
                "name": "ipython",
                "version": 3
            },
            "file_extension": ".py",
            "mimetype": "text/x-python",
            "name": "python",
            "nbconvert_exporter": "python",
            "pygments_lexer": "ipython3",
            "version": "3.8.0"
        }
    },
    "nbformat": 4,
    "nbformat_minor": 1
}
